{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Local LLMs with LiteLLM & Ollama\n",
    "\n",
    "In this notebook we'll create two agents, Joe and Cathy who like to tell jokes to each other. The agents will use locally running LLMs.\n",
    "\n",
    "Follow the guide at https://microsoft.github.io/autogen/docs/topics/non-openai-models/local-litellm-ollama/ to understand how to install LiteLLM and Ollama.\n",
    "\n",
    "We encourage going through the link, but if you're in a hurry and using Linux, run these:  \n",
    "  \n",
    "```\n",
    "curl -fsSL https://ollama.com/install.sh | sh\n",
    "\n",
    "ollama pull llama3.2:1b\n",
    "\n",
    "pip install 'litellm[proxy]'\n",
    "litellm --model ollama/llama3.2:1b\n",
    "```  \n",
    "\n",
    "This will run the proxy server and it will be available at 'http://0.0.0.0:4000/'."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get started, let's import some classes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dataclasses import dataclass\n",
    "\n",
    "from autogen_core import (\n",
    "    AgentId,\n",
    "    DefaultTopicId,\n",
    "    MessageContext,\n",
    "    RoutedAgent,\n",
    "    SingleThreadedAgentRuntime,\n",
    "    default_subscription,\n",
    "    message_handler,\n",
    ")\n",
    "from autogen_core.model_context import BufferedChatCompletionContext\n",
    "from autogen_core.models import (\n",
    "    AssistantMessage,\n",
    "    ChatCompletionClient,\n",
    "    SystemMessage,\n",
    "    UserMessage,\n",
    ")\n",
    "from autogen_ext.models.openai import OpenAIChatCompletionClient"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set up out local LLM model client."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_model_client() -> OpenAIChatCompletionClient:  # type: ignore\n",
    "    \"Mimic OpenAI API using Local LLM Server.\"\n",
    "    return OpenAIChatCompletionClient(\n",
    "        model=\"llama3.2:1b\",\n",
    "        api_key=\"NotRequiredSinceWeAreLocal\",\n",
    "        base_url=\"http://0.0.0.0:4000\",\n",
    "        model_capabilities={\n",
    "            \"json_output\": False,\n",
    "            \"vision\": False,\n",
    "            \"function_calling\": True,\n",
    "        },\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define a simple message class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "@dataclass\n",
    "class Message:\n",
    "    content: str"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, the Agent.\n",
    "\n",
    "We define the role of the Agent using the `SystemMessage` and set up a condition for termination."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "@default_subscription\n",
    "class Assistant(RoutedAgent):\n",
    "    def __init__(self, name: str, model_client: ChatCompletionClient) -> None:\n",
    "        super().__init__(\"An assistant agent.\")\n",
    "        self._model_client = model_client\n",
    "        self.name = name\n",
    "        self.count = 0\n",
    "        self._system_messages = [\n",
    "            SystemMessage(\n",
    "                content=f\"Your name is {name} and you are a part of a duo of comedians.\"\n",
    "                \"You laugh when you find the joke funny, else reply 'I need to go now'.\",\n",
    "            )\n",
    "        ]\n",
    "        self._model_context = BufferedChatCompletionContext(buffer_size=5)\n",
    "\n",
    "    @message_handler\n",
    "    async def handle_message(self, message: Message, ctx: MessageContext) -> None:\n",
    "        self.count += 1\n",
    "        await self._model_context.add_message(UserMessage(content=message.content, source=\"user\"))\n",
    "        result = await self._model_client.create(self._system_messages + await self._model_context.get_messages())\n",
    "\n",
    "        print(f\"\\n{self.name}: {message.content}\")\n",
    "\n",
    "        if \"I need to go\".lower() in message.content.lower() or self.count > 2:\n",
    "            return\n",
    "\n",
    "        await self._model_context.add_message(AssistantMessage(content=result.content, source=\"assistant\"))  # type: ignore\n",
    "        await self.publish_message(Message(content=result.content), DefaultTopicId())  # type: ignore"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Set up the agents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runtime = SingleThreadedAgentRuntime()\n",
    "\n",
    "model_client = get_model_client()\n",
    "\n",
    "cathy = await Assistant.register(\n",
    "    runtime,\n",
    "    \"cathy\",\n",
    "    lambda: Assistant(name=\"Cathy\", model_client=model_client),\n",
    ")\n",
    "\n",
    "joe = await Assistant.register(\n",
    "    runtime,\n",
    "    \"joe\",\n",
    "    lambda: Assistant(name=\"Joe\", model_client=model_client),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's run everything!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/tmp/ipykernel_1417357/2124203426.py:22: UserWarning: Resolved model mismatch: gpt-4o-2024-05-13 != ollama/llama3.1:8b. Model mapping may be incorrect.\n",
      "  result = await self._model_client.create(self._system_messages + await self._model_context.get_messages())\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Joe: Joe, tell me a joke.\n",
      "\n",
      "Cathy: Here's one:\n",
      "\n",
      "Why couldn't the bicycle stand up by itself?\n",
      "\n",
      "(waiting for your reaction...)\n",
      "\n",
      "Joe: *laughs* It's because it was two-tired! Ahahaha! That's a good one! I love it!\n",
      "\n",
      "Cathy: *roars with laughter* HAHAHAHA! Oh man, that's a classic! I'm glad you liked it! The setup is perfect and the punchline is just... *chuckles* Two-tired! I mean, come on! That's genius! We should definitely add that one to our act!\n",
      "\n",
      "Joe: I need to go now.\n"
     ]
    }
   ],
   "source": [
    "runtime.start()\n",
    "await runtime.send_message(\n",
    "    Message(\"Joe, tell me a joke.\"),\n",
    "    recipient=AgentId(joe, \"default\"),\n",
    "    sender=AgentId(cathy, \"default\"),\n",
    ")\n",
    "await runtime.stop_when_idle()\n",
    "\n",
    "# Close the connections to the model clients.\n",
    "await model_client.close()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
